Abstract

It is well known that a random $q$-ary code of rate $\Omega(\varepsilon^2)$ is list decodable up to radius $(1 - 1/q - \varepsilon)$ with list sizes on the order of $1/\varepsilon^2$, with probability $1 - o(1)$. However, until recently, a similar statement about random linear codes remained elusive. In a recent paper, Cheraghchi, Guruswami, and Velingker show a connection between list decodability of random linear codes and the Restricted Isometry Property from compressed sensing, and use this connection to prove that a random linear code of rate $\Omega(\varepsilon^2/\log^3(1/\varepsilon))$ achieves the list decoding properties above, with constant probability. We improve on their result to show that in fact we may take the rate to be $\Omega(\varepsilon^2)$, which is optimal, and further that the success probability is $1 - o(1)$, rather than constant. As an added benefit, our proof is relatively simple. Finally, we extend our methods to more general ensembles of linear codes. As an example, we show that randomly punctured Reed-Muller codes have the same list decoding properties as the original codes, even when the rate is improved to a constant.

1 Introduction 

In the theory of error correcting codes, one attempts to obtain subsets (codes) $\mathcal{C} \subseteq [q]^n$ which are simultaneously large and "spread out." If the rate of the code $R = \log_q|\mathcal{C}|/n$ is large, then each codeword $c \in \mathcal{C}$ contains a large amount of information. On the other hand, if the distance between any two codewords is large, then even if a codeword becomes corrupted, say, a fraction $\rho$ of its entries are changed, the original codeword may be uniquely recovered. There is a trade-off between the rate and distance, and sometimes this trade-off can be too harsh: it is not always necessary to recover exactly the intended codeword $c$, and it sometimes suffices to recover a short list of $L$ codewords. This relaxed notion, called list decoding, was introduced in the 1950s by Elias [Eli57] and Wozencraft [Woz58]. More formally, a code $\mathcal{C}$ is $(\rho, L)$-list decodable if, for any received word $w$, there are at most $L$ codewords within relative distance $\rho$ of $w$.

We will be interested in the list decodability of random codes, and in particular random linear codes. A linear code in $\mathbb{F}_q^n$ is a code which forms a linear subspace of $\mathbb{F}_q^n$, of dimension $k$. Unless otherwise noted, a random linear code will be a uniformly random linear code, where $\mathcal{C}$ is formed by choosing the subspace uniformly at random; equivalently, $\mathcal{C}$ is the $\mathbb{F}_q$-span of $k$ uniformly random vectors in $\mathbb{F}_q^n$.

Understanding the trade-offs in list decoding is interesting not just for communication, but also for a wide array of applications in complexity theory. List decodable codes can be used for hardness amplification of boolean functions, for constructing hardcore predicates from one-way functions, and for constructing randomness extractors, expanders, and pseudorandom generators. (See the surveys [Sud00, Vad11] for these and many more applications.) Understanding the behavior of linear codes, and in particular random linear codes, is also of interest: decoding a random linear code is related to the problem of learning with errors, a fundamental problem in both learning theory [BKW03, FGKP06] and cryptography [Reg05].

In this work, we show that for large error rates $\rho$, a random linear code has the optimal list decoding parameters, improving upon the recent result of Cheraghchi, Guruswami, and Velingker [CGV13]. Our result establishes the existence of such codes, previously unknown for $q > 2$. We extend our results to other (not necessarily uniform) ensembles of linear codes, including random families obtained from puncturing Reed-Muller codes.



1.1 Related Work 

In this paper, we will be interested in large error rates $\rho = (1 - 1/q)(1 - \varepsilon)$, for small $\varepsilon$. Since a random word $r \in \mathbb{F}_q^n$ will disagree with any fixed codeword on a $1 - 1/q$ fraction of symbols in expectation, this is the largest error rate we can hope for. This large-$\rho$ regime is especially of interest for applications in complexity theory, so we seek to understand the trade-offs between the achievable rates and list sizes, in terms of $\varepsilon$.

When $\rho$ is constant, Guruswami, Håstad, and Kopparty [GHK11] show that a random linear code of rate $1 - H_q(\rho) - C_{\rho,q}/L$ is $(\rho, L)$-list decodable, where $H_q(x) = x\log_q(q-1) - x\log_q(x) - (1-x)\log_q(1-x)$ is the $q$-ary entropy. This matches lower bounds of Rudra and Guruswami-Narayanan [Rud11, GN12]. However, for $\rho = (1 - 1/q)(1 - \varepsilon)$, the constant $C_{\rho,q}$ depends exponentially on $\varepsilon$, and this result quickly degrades.

When $\rho = (1 - 1/q)(1 - \varepsilon)$, it follows from a straightforward computation that a random (not necessarily linear) code of rate $\Omega(\varepsilon^2)$ is $((1 - 1/q)(1 - \varepsilon), O(1/\varepsilon^2))$-list decodable. However, until recently, the best upper bounds known for random linear codes with rate $\Omega(\varepsilon^2)$ had list sizes exponential in $1/\varepsilon$ [ZP81]; closing this exponential gap between random linear codes and general random codes was posed by [Eli91]. The existence of a binary linear code with rate $\Omega(\varepsilon^2)$ and list size $O(1/\varepsilon^2)$ was shown in [GHSZ02]. However, this result only holds for binary codes, and further the proof does not show that most linear codes have this property. Cheraghchi, Guruswami, and Velingker (henceforth CGV) recently made substantial progress on closing the gap between random linear codes and general random codes. Using a connection between list decodability of random linear codes and the Restricted Isometry Property (RIP) from compressed sensing, they proved the following theorem.

Theorem 1. [Theorem 12 in [CGV13]] Let $q$ be a prime power, and let $\varepsilon, \gamma > 0$ be constant parameters. Then for all large enough integers $n$, a random linear code $\mathcal{C} \subseteq \mathbb{F}_q^n$ of rate $R$, for some

$$R \ge C\,\frac{\varepsilon^2}{\log(1/\gamma)\,\log^3(q/\varepsilon)\,\log(q)},$$

is $((1 - 1/q)(1 - \varepsilon), O(1/\varepsilon^2))$-list decodable with probability at least $1 - \gamma$.

It is known that the rate cannot exceed $O(\varepsilon^2)$ (this follows from the list decoding capacity theorem). Further, the recent lower bounds of Guruswami and Vadhan [GV10] and Blinovsky [Bli05, Bli08] show that the list size $L$ must be at least $\Omega_q(1/\varepsilon^2)$. Thus, Theorem 1 has nearly optimal dependence on $\varepsilon$, leaving a polylogarithmic gap.
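
As a numerical illustration of why $\varepsilon^2$ is the natural scale for the rate at radius $\rho = (1 - 1/q)(1 - \varepsilon)$, the following Python sketch evaluates the $q$-ary entropy $H_q$ defined above and the gap $1 - H_q(\rho)$. The function names are illustrative and do not come from the paper, and the comparison constant $(q-1)/(2\ln q)$ is an assumption of this sketch: it comes from a second-order Taylor expansion of $H_q$ around its maximum at $x = 1 - 1/q$, not from any of the cited results.

```python
import math

def entropy_q(x, q):
    """q-ary entropy H_q(x) for 0 < x < 1."""
    def log_q(v):
        return math.log(v) / math.log(q)
    return x * log_q(q - 1) - x * log_q(x) - (1 - x) * log_q(1 - x)

def capacity_gap(eps, q):
    """1 - H_q(rho) at the radius rho = (1 - 1/q)(1 - eps)."""
    rho = (1 - 1 / q) * (1 - eps)
    return 1 - entropy_q(rho, q)

def predicted_constant(q):
    """Leading constant from a second-order Taylor expansion (illustrative)."""
    return (q - 1) / (2 * math.log(q))

if __name__ == "__main__":
    for q in (2, 3, 16):
        for eps in (0.1, 0.01, 0.001):
            # The ratio gap / eps^2 should settle near predicted_constant(q)
            # as eps shrinks, i.e. the gap is of order eps^2.
            print(q, eps, capacity_gap(eps, q) / eps**2, predicted_constant(q))
```

For small $\varepsilon$ the printed ratio approaches a constant depending only on $q$, which is the quadratic behaviour behind the $O(\varepsilon^2)$ upper bound on the rate.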



1.2 Our contributions 

The extra logarithmic factors in the result of CGV stem from the difficulty in proving that the RIP is likely to hold for randomly subsampled Fourier matrices. Removing these logarithmic factors is considered to be a difficult problem. In this work, we show that while the RIP is a sufficient condition for list decoding, it may not be necessary. We formulate a different sufficient condition for list decodability: while the RIP is about controlling the $\ell_2$ norm of $\Phi x$ for a matrix $\Phi$ and a sparse vector $x$ with $\|x\|_2 = 1$, our sufficient condition amounts to controlling the $\ell_1$ norm of $\Phi x$, with the same conditions on $x$. Next, we show, using techniques from high dimensional probability, that this condition does hold with overwhelming probability for random linear codes, with no extra logarithmic dependence on $\varepsilon$. The punchline, and our main result, is the following theorem.

Theorem 2. Let $q$ be a prime power, and fix $\varepsilon > 0$. Then for all large enough integers $n$, a random linear code $\mathcal{C} \subseteq \mathbb{F}_q^n$ of rate $R$, for

$$R \ge C\,\frac{\varepsilon^2}{\log(q)},$$

is $((1 - 1/q)(1 - \varepsilon), O(1/\varepsilon^2))$-list decodable with probability at least $1 - o(1)$. Above, $C$ is an absolute constant.

There are three differences between Theorem 1 and Theorem 2. First, the dependence on $\varepsilon$ in Theorem 2 is optimal. Second, the dependence on $q$ is also improved by several log factors. Finally, the success probability in Theorem 2 is $1 - o(1)$, compared to a constant success probability in Theorem 1. As an additional benefit, the proof of Theorem 2 is relatively short, while the proof of the RIP result in [CGV13] is quite difficult.

To demonstrate the applicability of our techniques, we extend our approach to apply to not necessarily uniform ensembles of linear codes. We formulate a more general version of Theorem 2, and give examples of codes to which it applies. Our main example is linear codes $\mathcal{E}$ of rate $\Omega(\varepsilon^2)$ whose generator matrix is chosen by randomly sampling the columns of a generator matrix of a linear code $\mathcal{C}$ of nonconstant rate. Ignoring details about repeating columns, $\mathcal{E}$ can be viewed as a randomly punctured version of $\mathcal{C}$. Random linear codes fit into this framework when $\mathcal{C}$ is taken to be $\mathrm{RM}_q(1, k)$, the $q$-ary Reed-Muller code of degree 1 and dimension $k$. We extend this in a natural way by taking $\mathcal{C} = \mathrm{RM}(r, m)$ to be any (binary) Reed-Muller code. It has recently been shown [GKZ08, KLP12] that $\mathrm{RM}(r, m)$ is list decodable up to $1/2 - \varepsilon$, with exponential but nontrivial list sizes. However, $\mathrm{RM}(r, m)$ is not a "good" code, in the sense that it does not have constant rate. In the same spirit as our main result, we show that when $\mathrm{RM}(r, m)$ is punctured down to rate $O(\varepsilon^2)$, with high probability the resulting code is list decodable up to radius $1/2 - \varepsilon$ with asymptotically no loss in list size.

1.3 Our approach 

The CGV proof of Theorem 1 proceeds in three steps. The first step is to prove an average case Johnson bound, that is, a sufficient condition for list decoding that depends on the average pairwise distances between codewords, rather than the worst-case differences. The second step is a translation of the coding theory setting to a setting suitable for the RIP: a code $\mathcal{C}$ is encoded as a matrix $\Phi$ whose columns correspond to codewords of $\mathcal{C}$. This encoding has the property that if $\Phi$ has the RIP with good parameters, then $\mathcal{C}$ is list decodable with similarly good parameters. Finally, the last and most technical step is proving that the matrix $\Phi$ does indeed have the Restricted Isometry Property with the desired parameters.

In this work, we use the second step from the CGV analysis (the encoding from codes to matrices), but 
we bypass the other steps. While both the average case Johnson bound and the improved RIP analysis 
for Fourier matrices are clearly of independent interest, our analysis will be much simpler, and obtains the 
correct dependence on $\varepsilon$.

1.4 Organization 

In Section 2, we fix notation and definitions, and also introduce the simplex encoding map from the second step of the CGV analysis. In Section 3, we state our sufficient condition and show that it implies list decoding, which is straightforward. We take a detour in Section 3.1 to note that the sufficiency of our condition in fact implies the sufficiency of the Restricted Isometry Property directly, providing an alternative proof of Theorem 11 in [CGV13]. In Section 4, we prove that our sufficient condition holds, and conclude Theorem 2. Finally, in Section 5, we discuss the generality of our result, and show that it applies to other ensembles of linear codes.

2 Definitions and Preliminaries 

Throughout, we will be interested in linear, $q$-ary codes $\mathcal{C}$ with length $n$. The size of $\mathcal{C}$ will be $N = |\mathcal{C}|$, and the dimension will be $k = \log_q(N)$. We use the notation $[q] = \{0, \ldots, q-1\}$, and for a prime power $q$, $\mathbb{F}_q$ denotes the finite field with $q$ elements. Nonlinear codes use the alphabet $[q]$, and linear codes use the alphabet $\mathbb{F}_q$. When notationally convenient, we identify $[q]$ with $\mathbb{F}_q$: for our purposes, this identification may be arbitrary. We let $\omega = e^{2\pi i/q}$ denote the primitive $q$-th root of unity, and we use $S_L \subset \{0,1\}^N$ to denote the space of $L$-sparse binary vectors. For two vectors $x, y \in [q]^n$, the relative Hamming distance between them is

$$d(x, y) = \frac{1}{n}\,\bigl|\{i : x_i \neq y_i\}\bigr|.$$

Throughout, $C_i$ denotes numerical constants. For clarity, we have made no attempt to optimize the values of the constants.

A code is list decodable if any received word $w$ does not have too many codewords close to it:

Definition 3. A code $\mathcal{C} \subseteq [q]^n$ is $(\rho, L)$-list decodable if for all $w \in [q]^n$,

$$\bigl|\{c \in \mathcal{C} : d(c, w) \le \rho\}\bigr| \le L.$$

A code is linear of dimension $k$ if the set $\mathcal{C}$ of codewords is of the form $\mathcal{C} = \{xG \mid x \in \mathbb{F}_q^k\}$, for a $k \times n$ generator matrix $G$. We say that $\mathcal{C}$ is a random linear code if the generator matrix $G$ is chosen uniformly at random from $\mathbb{F}_q^{k \times n}$.
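
To make Definition 3 concrete, here is a minimal brute-force sketch in Python that builds the codewords $xG$ of a small linear code and counts how many codewords fall within relative distance $\rho$ of each received word. It assumes $q$ is prime, so that arithmetic modulo $q$ is field arithmetic; the function names are illustrative, and the exhaustive search is only feasible for toy parameters.

```python
import itertools
import random

def hamming(x, y):
    """Number of coordinates in which x and y differ."""
    return sum(a != b for a, b in zip(x, y))

def linear_code(G, q):
    """Set of codewords xG over F_q, assuming q is prime (arithmetic mod q).

    If G is rank deficient the set has fewer than q^k distinct codewords.
    """
    k, n = len(G), len(G[0])
    return {tuple(sum(x[i] * G[i][j] for i in range(k)) % q for j in range(n))
            for x in itertools.product(range(q), repeat=k)}

def max_list_size(code, q, rho):
    """Largest number of codewords within relative distance rho of any w."""
    n = len(next(iter(code)))
    return max(sum(hamming(c, w) <= rho * n for c in code)
               for w in itertools.product(range(q), repeat=n))

def is_list_decodable(code, q, rho, L):
    """Check Definition 3 by exhaustive search over all received words."""
    return max_list_size(code, q, rho) <= L

if __name__ == "__main__":
    rng = random.Random(0)
    q, k, n = 2, 3, 8
    # Uniformly random k x n generator matrix, as in the definition above.
    G = [[rng.randrange(q) for _ in range(n)] for _ in range(k)]
    C = linear_code(G, q)
    print(len(C), max_list_size(C, q, 0.25))
```

The search enumerates all $q^n$ received words, so it serves only to check the definition on small examples.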

We make use of the simplex encoding used in the CGV analysis, which maps the code $\mathcal{C}$ to a complex matrix.

Definition 4 (Simplex encoding from [CGV13]). Define a map $\varphi : [q] \to \mathbb{C}^{q-1}$ by $\varphi(x)(\alpha) = \omega^{x\alpha}$ for $\alpha \in \{1, \ldots, q-1\}$. We extend this map to a map $\varphi : [q]^n \to \mathbb{C}^{n(q-1)}$ in the natural way by concatenation. Further, we extend $\varphi$ to act on sets $\mathcal{C} \subseteq [q]^n$: $\varphi(\mathcal{C})$ is the $n(q-1) \times |\mathcal{C}|$ matrix whose columns are $\varphi(c)$ for $c \in \mathcal{C}$.

Suppose that $\mathcal{C}$ is a random $q$-ary linear code with length $n$, size $N$, and dimension $k$, as above. Consider the $n \times N$ matrix $M$ which has the codewords as columns. The rows of this matrix are independent: each row corresponds to a column $t$ of the random generator matrix $G$. To sample a row $r$, we choose $t \in \mathbb{F}_q^k$ uniformly at random (with replacement), and let $r = (\langle t, x\rangle)_{x \in \mathbb{F}_q^k}$. Let $T$ denote the random multiset with elements in $\mathbb{F}_q^k$ consisting of the draws $t$. To obtain $\Phi = \varphi(\mathcal{C})$, we replace each symbol $\beta$ of $M$ with its simplex encoding $\varphi(\beta)$, regarded as a column vector. Thus, each row of $\Phi$ corresponds to a vector $t \in T$ (a row of the original matrix $M$, or a column of the generator matrix $G$), and an index $\alpha \in \{1, \ldots, q-1\}$ (a coordinate of the simplex encoding). We denote this row by $f_{t,\alpha}$.

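We use the following facts about the simplex encoding, also from [CGV13]:

1. For $x, y \in [q]^n$,

$$\langle \varphi(x), \varphi(y)\rangle = (q-1)n - q\,d(x,y)\,n. \tag{1}$$

2. If $\mathcal{C}$ is a linear code, the columns of $\Phi$ are orthogonal in expectation. That is, for codewords $x, y \in \mathcal{C}$, indexed by $i, j \in \mathbb{F}_q^k$ respectively, we have

$$\mathbb{E}\,d(x,y) = \frac{1}{n}\,\mathbb{E}\sum_{t \in T} \mathbf{1}_{\langle t, i\rangle \neq \langle t, j\rangle} = \begin{cases} 1 - \frac{1}{q} & i \neq j \\ 0 & i = j. \end{cases}$$

Combined with (1), we have

$$\mathbb{E}\,\langle \varphi(x), \varphi(y)\rangle = (q-1)n - qn\,\mathbb{E}\,d(x,y) = \begin{cases} (q-1)n & x = y \\ 0 & x \neq y. \end{cases}$$

This implies that

$$\mathbb{E}\|\Phi x\|_2^2 = \sum_{i,j \in [N]} x_i x_j\,\mathbb{E}\,\langle \varphi(c_i), \varphi(c_j)\rangle = (q-1)n\,\|x\|_2^2. \tag{2}$$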
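
The two facts above can be checked directly on small examples. The Python sketch below implements the simplex encoding and verifies identity (1) by exhaustive enumeration. To illustrate orthogonality without sampling, it also builds the code whose generator matrix has every $t \in \mathbb{F}_q^k$ as a column exactly once, so that the average over $t$ is exact rather than an expectation. This is a sketch under the assumption that $q$ is prime (arithmetic modulo $q$), and all function names are illustrative.

```python
import cmath
import itertools

def phi(word, q):
    """Simplex encoding of a word in [q]^n: omega^(x*alpha), alpha = 1..q-1."""
    omega = cmath.exp(2j * cmath.pi / q)
    return [omega ** (x * a) for x in word for a in range(1, q)]

def inner(u, v):
    """Hermitian inner product <u, v> = sum_i u_i * conj(v_i)."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def hamming_rel(x, y):
    """Relative Hamming distance d(x, y)."""
    return sum(a != b for a, b in zip(x, y)) / len(x)

def identity_1_error(q, n):
    """Largest deviation from identity (1) over all pairs x, y in [q]^n."""
    worst = 0.0
    words = list(itertools.product(range(q), repeat=n))
    for x in words:
        for y in words:
            lhs = inner(phi(x, q), phi(y, q))
            rhs = (q - 1) * n - q * hamming_rel(x, y) * n
            worst = max(worst, abs(lhs - rhs))
    return worst

def full_code(q, k):
    """Codewords (<t, x>)_t where t ranges over all of F_q^k once (q prime)."""
    ts = list(itertools.product(range(q), repeat=k))
    return [tuple(sum(a * b for a, b in zip(t, x)) % q for t in ts)
            for x in itertools.product(range(q), repeat=k)]

if __name__ == "__main__":
    print(identity_1_error(3, 3))          # numerically zero
    code = full_code(3, 2)
    enc = [phi(c, 3) for c in code]
    # Distinct columns of Phi are orthogonal; each has squared norm (q-1)n.
    print(abs(inner(enc[0], enc[1])), inner(enc[1], enc[1]).real)
```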
3 Sufficient conditions for list decodability

Suppose that $\mathcal{C}$ is a linear code as above, and let $\Phi = \varphi(\mathcal{C}) \in \mathbb{C}^{n(q-1) \times N}$ be the complex matrix associated with $\mathcal{C}$ by the simplex encoding. We first translate Definition 3 into a linear algebraic statement about $\Phi$. The identity (1) implies that $\mathcal{C}$ is $(\rho, L-1)$-list decodable if and only if for all $w \in \mathbb{F}_q^n$, for all sets $\Lambda \subset \mathcal{C}$ with $|\Lambda| = L$, there is at least one codeword $c \in \Lambda$ so that $d(w, c) > \rho$, that is, so that

$$\langle \varphi(c), \varphi(w)\rangle < (q-1)n - q\rho n.$$

Translating the quantifiers into appropriate max's and min's, we observe

Observation 5. A code $\mathcal{C}$ is $(\rho, L-1)$-list decodable if and only if

$$\max_{w \in [q]^n}\ \max_{\Lambda \subset \mathcal{C},\,|\Lambda| = L}\ \min_{c \in \Lambda}\ \langle \varphi(w), \varphi(c)\rangle < (q-1)n - q\rho n.$$

When $\rho = (1 - 1/q)(1 - \varepsilon)$, $\mathcal{C}$ is $(\rho, L-1)$-list decodable if and only if

$$\max_{w \in [q]^n}\ \max_{\Lambda \subset \mathcal{C},\,|\Lambda| = L}\ \min_{c \in \Lambda}\ \langle \varphi(w), \varphi(c)\rangle < (q-1)n\varepsilon. \tag{3}$$

We seek sufficient conditions for (3). Below is the one we will find useful:

Lemma 6. Let $\mathcal{C}$ be a $q$-ary linear code of length $n$ and size $N$, and let $\Phi = \varphi(\mathcal{C})$ as above. Suppose that

$$\frac{1}{L}\max_{x \in S_L}\|\Phi x\|_1 < (q-1)n\varepsilon. \tag{4}$$

Then (3) holds, and hence $\mathcal{C}$ is $((1 - 1/q)(1 - \varepsilon), L-1)$-list decodable.
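Proof. We always have

$$\min_{c \in \Lambda}\ \langle \varphi(w), \varphi(c)\rangle \le \frac{1}{L}\sum_{c \in \Lambda}\langle \varphi(w), \varphi(c)\rangle,$$

so

$$\begin{aligned}
\max_{w \in [q]^n}\ \max_{|\Lambda| = L}\ \min_{c \in \Lambda}\ \langle \varphi(w), \varphi(c)\rangle
&\le \frac{1}{L}\max_{w \in [q]^n}\ \max_{|\Lambda| = L}\ \sum_{c \in \Lambda}\langle \varphi(w), \varphi(c)\rangle \\
&= \frac{1}{L}\max_{w \in [q]^n}\ \max_{x \in S_L}\ \langle \varphi(w), \Phi x\rangle \\
&\le \frac{1}{L}\max_{w \in [q]^n}\|\varphi(w)\|_\infty\ \max_{x \in S_L}\|\Phi x\|_1 \\
&= \frac{1}{L}\max_{x \in S_L}\|\Phi x\|_1.
\end{aligned}$$

Thus it suffices to bound the last line by $(q-1)n\varepsilon$. □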
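
The chain of inequalities in this proof holds for any code, so it can be sanity-checked by brute force. The following Python sketch computes the left-hand side of (3) and the left-hand side of (4) for a small code and a given $L$. The function names are illustrative, and the exhaustive enumeration over all received words and all supports is only meant for toy sizes.

```python
import cmath
import itertools

def phi(word, q):
    """Simplex encoding of a word in [q]^n."""
    omega = cmath.exp(2j * cmath.pi / q)
    return [omega ** (x * a) for x in word for a in range(1, q)]

def inner_real(u, v):
    """Hermitian inner product; it is real for simplex encodings."""
    return sum(a * b.conjugate() for a, b in zip(u, v)).real

def lhs_of_3(code, q, L):
    """max_w max_{|Lambda|=L} min_{c in Lambda} <phi(w), phi(c)>."""
    n = len(code[0])
    enc = [phi(c, q) for c in code]
    best = -float("inf")
    for w in itertools.product(range(q), repeat=n):
        pw = phi(w, q)
        # For fixed w the best Lambda keeps the L largest inner products,
        # so the max-min equals the L-th largest value.
        vals = sorted((inner_real(pw, e) for e in enc), reverse=True)
        best = max(best, vals[L - 1])
    return best

def lhs_of_4(code, q, L):
    """(1/L) * max over L-sparse binary x of ||Phi x||_1."""
    enc = [phi(c, q) for c in code]
    rows = len(enc[0])
    best = 0.0
    for support in itertools.combinations(range(len(code)), L):
        l1 = sum(abs(sum(enc[j][r] for j in support)) for r in range(rows))
        best = max(best, l1)
    return best / L

if __name__ == "__main__":
    code = [(0, 0, 0, 0), (1, 1, 0, 0), (0, 0, 1, 1), (1, 1, 1, 1)]
    print(lhs_of_3(code, 2, 2), lhs_of_4(code, 2, 2))
```

On any input the first value should not exceed the second (up to floating-point error), which is exactly the statement that (4) implies (3).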
3.1 Aside: the Restricted Isometry Property 

A matrix $A$ has the Restricted Isometry Property (RIP) if, for some constant $\delta$ and sparsity level $s$,

$$(1 - \delta)\|x\|_2^2 \le \|Ax\|_2^2 \le (1 + \delta)\|x\|_2^2$$

for all $s$-sparse vectors $x$. The best constant $\delta = \delta(A, s)$ is called the Restricted Isometry Constant. The RIP is an important quantity in compressed sensing, and much work has gone into understanding it.

CGV have shown that if $\frac{1}{\sqrt{n(q-1)}}\varphi(\mathcal{C})$ has the RIP with appropriate parameters, $\mathcal{C}$ is list decodable. The proof that the RIP is a sufficient condition follows, after some computations, from an average-case Johnson bound. While the average-case Johnson bound is interesting on its own, in this section we note that Lemma 6 implies the sufficiency of the RIP immediately. Indeed, by Cauchy-Schwarz,

$$\begin{aligned}
\frac{1}{L}\max_{x \in S_L}\|\Phi x\|_1
&\le \frac{\sqrt{n(q-1)}}{L}\max_{x \in S_L}\|\Phi x\|_2 \\
&= \frac{n(q-1)}{L}\max_{x \in S_L}\|\tilde{\Phi} x\|_2 \\
&\le \frac{\sqrt{n(q-1)}}{L}\Bigl(\sqrt{n(q-1)}\,(1 + \delta)\max_{x \in S_L}\|x\|_2\Bigr) \\
&= \frac{n(q-1)(1 + \delta)}{\sqrt{L}},
\end{aligned}$$

where $\delta = \delta(\tilde{\Phi}, L)$ is the restricted isometry constant for $\tilde{\Phi} = \frac{1}{\sqrt{n(q-1)}}\Phi$ and sparsity $L$ (the third line uses $\sqrt{1+\delta} \le 1 + \delta$, and the last uses $\|x\|_2 = \sqrt{L}$ for $x \in S_L$). By Lemma 6, this implies that

$$\frac{\delta + 1}{\sqrt{L}} < \varepsilon$$

also implies (3), and hence $((1 - 1/q)(1 - \varepsilon), L-1)$-list decodability. Setting $\delta = 1/2$, we may conclude the following statement:

For any code $\mathcal{C} \subseteq [q]^n$, if $\frac{1}{\sqrt{n(q-1)}}\varphi(\mathcal{C})$ has the RIP with constant $1/2$ and sparsity level $L$, then $\mathcal{C}$ is $\bigl((1 - 1/q)\bigl(1 - \tfrac{3}{2\sqrt{L}}\bigr), L-1\bigr)$-list decodable.

This precisely recovers Theorem 11 from [CGV13].

4 A random linear code is list decodable 

We wish to show that, when $\Phi = \varphi(\mathcal{C})$ for a random linear code $\mathcal{C}$, (4) holds with high probability, so we need to bound $\max_{x \in S_L}\|\Phi x\|_1$. We write

$$\max_{x \in S_L}\|\Phi x\|_1 \le \max_{x \in S_L}\mathbb{E}\|\Phi x\|_1 + \max_{x \in S_L}\bigl|\|\Phi x\|_1 - \mathbb{E}\|\Phi x\|_1\bigr|, \tag{5}$$

and we will bound each term separately. First, we observe that $\mathbb{E}\|\Phi x\|_1$ is of the right size.

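Lemma 7. Let $\mathcal{C} \subseteq \mathbb{F}_q^n$ be a random linear $q$-ary code of dimension $k$, and let $N = q^k$. Let $\Phi = \varphi(\mathcal{C})$ as above. Then for any $x \in S_L$,

$$\frac{1}{L}\,\mathbb{E}\|\Phi x\|_1 \le \frac{n(q-1)}{\sqrt{L}}.$$

Proof. The proof is a straightforward consequence of (2). For any $x \in S_L$, we have

$$\begin{aligned}
\mathbb{E}\|\Phi x\|_1 &\le \sqrt{n(q-1)}\,\mathbb{E}\|\Phi x\|_2 \\
&\le \sqrt{n(q-1)}\,\bigl(\mathbb{E}\|\Phi x\|_2^2\bigr)^{1/2} \\
&= n(q-1)\sqrt{L},
\end{aligned}$$

using (2) and the fact that $\|x\|_2 = \sqrt{L}$. □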
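
Because the rows of $\Phi$ are i.i.d. over the choice of the column $t$, the expectation $\mathbb{E}\|\Phi x\|_1$ can be computed exactly for small parameters by averaging over all $t \in \mathbb{F}_q^k$, with no sampling. The Python sketch below does this and compares the result with the bound $n(q-1)\sqrt{L}$ of Lemma 7. It assumes $q$ is prime (arithmetic modulo $q$) and identifies each codeword with its message vector; the function names are illustrative.

```python
import cmath
import itertools
import math

def expected_l1(messages, q, k, n):
    """Exact E||Phi x||_1 for x the indicator of the given message vectors.

    Each of the n columns t of G is uniform on F_q^k, so the expectation is
    n times the average, over all t, of the contribution of one column.
    """
    omega = cmath.exp(2j * cmath.pi / q)
    total = 0.0
    for t in itertools.product(range(q), repeat=k):
        # Symbol <t, m> of each selected codeword in the coordinate given by t.
        symbols = [sum(a * b for a, b in zip(t, m)) % q for m in messages]
        for alpha in range(1, q):
            total += abs(sum(omega ** (alpha * s) for s in symbols))
    return n * total / q**k

def worst_ratio(q, k, n, L):
    """Max over supports of E||Phi x||_1 / (n (q-1) sqrt(L)).

    Lemma 7 asserts that this ratio is at most 1.
    """
    all_messages = list(itertools.product(range(q), repeat=k))
    bound = n * (q - 1) * math.sqrt(L)
    return max(expected_l1(S, q, k, n) / bound
               for S in itertools.combinations(all_messages, L))

if __name__ == "__main__":
    print(worst_ratio(3, 2, 10, 3))
    print(worst_ratio(2, 3, 7, 4))
```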

Next, we control the deviation of $\|\Phi x\|_1$ from $\mathbb{E}\|\Phi x\|_1$, uniformly over $x \in S_L$. We do not require the vectors $t_i$ to be drawn uniformly at random anymore, so long as they are selected independently.

Lemma 8. Let $\mathcal{C} \subseteq \mathbb{F}_q^n$ be a $q$-ary linear code, so that the columns $t_1, \ldots, t_n$ of the generator matrix are independent. Then

$$\frac{1}{L}\max_{x \in S_L}\bigl|\|\Phi x\|_1 - \mathbb{E}\|\Phi x\|_1\bigr| \le C_0\,(q-1)\sqrt{n\ln(N)}$$

with probability $1 - 1/\mathrm{poly}(N)$, for an absolute constant $C_0$.
Remark 1. As noted above, we do not make any assumptions on the distribution of the vectors $t_1, \ldots, t_n$, other than that they are chosen independently. In fact, we do not even require the code to be linear: it is enough for the vectors $v_i = (c(i))_{c \in \mathcal{C}} \in [q]^N$ to be independent. However, as we only consider linear codes in this work, we stick with our statement in order to keep the notation consistent.

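As a warm-up to the proof, which involves a few too many symbols, consider first the case when $q = 2$, and suppose that we wish to succeed with constant probability. Then the rows $f_t$ of $\Phi$ are rows of the Hadamard matrix, chosen independently. By standard symmetrization and comparison arguments (made precise below), it suffices to bound

$$\begin{aligned}
\frac{1}{L}\,\mathbb{E}\max_{x \in S_L}\sum_{t \in T} g_t\,\langle f_t, x\rangle
&= \frac{1}{L}\,\mathbb{E}\max_{x \in S_L}\langle g, \Phi x\rangle \\
&\le \mathbb{E}\max_{x \in B_1^N}\langle g, \Phi x\rangle \\
&= \mathbb{E}\max_{y \in \Phi(B_1^N)}\langle g, y\rangle,
\end{aligned}$$

where above $g = (g_1, g_2, \ldots, g_n)$ is a vector of i.i.d. standard normal random variables, and $B_1^N$ denotes the $\ell_1$ ball in $\mathbb{R}^N$. The last line is the mean width of $\Phi(B_1^N)$, which is a polytope contained in the convex hull of $\pm\varphi(c)$ for $c \in \mathcal{C}$ (that is, the columns of $\Phi$ and their opposites). So, using estimates for Gaussian random variables [LT91, Eq. (3.13)],

$$\begin{aligned}
\mathbb{E}\max_{y \in \Phi(B_1^N)}\langle g, y\rangle
&= \mathbb{E}\max_{\pm c \in \mathcal{C}}\langle g, \varphi(c)\rangle \\
&\le 3\sqrt{\log|\mathcal{C}|}\,\sqrt{\mathbb{E}\langle g, \varphi(c)\rangle^2} \\
&= 3\,\|\varphi(c)\|_2\sqrt{\log(N)},
\end{aligned}$$

which is what we wanted, since $\|\varphi(c)\|_2 = \sqrt{n}$ when $q = 2$.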

For general $q$ and failure probability $o(1)$, there is slightly more notation, but the proof idea is the same. We will need the following bound on moments of maxima of Gaussian random variables:

Lemma 9. Let $X_1, \ldots, X_N$ be standard normal random variables (not necessarily independent). Then

$$\Bigl(\mathbb{E}\max_{i \le N}|X_i|^p\Bigr)^{1/p} \le C_1\,N^{1/p}\sqrt{p}$$

for some absolute constant $C_1$.

Proof. Let $Z = \max_{i \le N}|X_i|$. Then

$$\mathbb{P}\{Z > s\} \le N\exp(-s^2/2)$$

for $s \ge 1$. Integrating,

$$\begin{aligned}
\mathbb{E}|Z|^p &= \int \mathbb{P}\{Z^p > s\}\,ds \\
&= \int \mathbb{P}\{Z^p > t^p\}\,p\,t^{p-1}\,dt \\
&\le 1 + N\int_1^\infty \exp(-t^2/2)\,p\,t^{p-1}\,dt \\
&\le 1 + N p\,2^{p/2}\,\Gamma(p/2) \\
&\le 1 + (Np)\,(p^{p/2}).
\end{aligned}$$

Thus,

$$\bigl(\mathbb{E}|Z|^p\bigr)^{1/p} \le C_1\,N^{1/p}\sqrt{p}$$

for some absolute constant $C_1$. □

Now we may prove Lemma 8.
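
Before the proof of Lemma 8, the last two steps in the proof of Lemma 9 can be sanity-checked numerically. The Python sketch below checks the inequality $2^{p/2}\Gamma(p/2) \le p^{p/2}$ and evaluates the ratio of $(1 + Np\,p^{p/2})^{1/p}$ to $N^{1/p}\sqrt{p}$, which is a candidate value for $C_1$. The function names are illustrative. One observation made here, not in the source: for $p = 1$ the inequality $2^{p/2}\Gamma(p/2) \le p^{p/2}$ does not hold literally, since $\sqrt{2}\,\Gamma(1/2) = \sqrt{2\pi} > 1$, so the check is run for $p \ge 2$; the discrepancy at $p = 1$ only affects the constant.

```python
import math

def gamma_step_holds(p):
    """Check 2^(p/2) * Gamma(p/2) <= p^(p/2), the final step of the proof."""
    return 2 ** (p / 2) * math.gamma(p / 2) <= p ** (p / 2)

def moment_bound_ratio(N, p):
    """(1 + N p p^(p/2))^(1/p) divided by N^(1/p) sqrt(p).

    A uniform upper bound on this ratio is a valid choice of C_1
    given the integral estimate in the proof.
    """
    log_val = math.log1p(N * p * p ** (p / 2))
    return math.exp(log_val / p) / (N ** (1 / p) * math.sqrt(p))

if __name__ == "__main__":
    print(all(gamma_step_holds(p) for p in range(2, 41)))
    print(max(moment_bound_ratio(N, p)
              for N in (10, 10**3, 10**6) for p in range(2, 41)))
```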



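Proof of Lemma 8. We recall the notation from the facts in Section 2: the rows of $\Phi$ are $f_{t,\alpha}$ for $t \in T$, where $T$ is a random multiset of size $n$ with elements in $\mathbb{F}_q^k$, and $\alpha \in \mathbb{F}_q^*$.

To control the largest deviation of $\|\Phi x\|_1$ from its expectation, we will control the $p$-th moments of this deviation; eventually we will choose $p \sim \ln(N)$. By a symmetrization argument followed by a comparison principle (Lemma 6.3 and Equation (4.8), respectively, in [LT91]), for any $p \ge 1$,

$$\begin{aligned}
\mathbb{E}\max_{x \in S_L}\bigl|\|\Phi x\|_1 - \mathbb{E}\|\Phi x\|_1\bigr|^p
&= \mathbb{E}\max_{x \in S_L}\Bigl|\sum_{t \in T}\sum_{\alpha \in \mathbb{F}_q^*}\bigl(|\langle f_{t,\alpha}, x\rangle| - \mathbb{E}|\langle f_{t,\alpha}, x\rangle|\bigr)\Bigr|^p \\
&\le C_2\,\mathbb{E}_T\mathbb{E}_g\max_{x \in S_L}\Bigl|\sum_{t \in T} g_t\sum_{\alpha \in \mathbb{F}_q^*}|\langle f_{t,\alpha}, x\rangle|\Bigr|^p \\
&\le C_2\,\mathbb{E}_T\mathbb{E}_g\max_{x \in S_L}\Bigl((q-1)\max_{\alpha \in \mathbb{F}_q^*}\Bigl|\sum_{t \in T} g_t\,|\langle f_{t,\alpha}, x\rangle|\Bigr|\Bigr)^p \\
&\le C_2\,(q-1)^p\,\mathbb{E}_T\mathbb{E}_g\max_{x \in S_L}\max_{\alpha \in \mathbb{F}_q^*}\Bigl|\sum_{t \in T} g_t\,\langle f_{t,\alpha}, x\rangle\Bigr|^p,
\end{aligned} \tag{6}$$

where the $g_t$ are i.i.d. standard normal random variables, and we dropped the absolute values because $g_t\,|\langle f_{t,\alpha}, x\rangle|$ is distributed identically to $g_t\,\langle f_{t,\alpha}, x\rangle$. Above, we used the independence of the vectors $f_{t,\alpha}$ for a fixed $\alpha$ to apply the symmetrization.

For fixed $\alpha$, let $\Phi_\alpha$ denote $\Phi$ restricted to the rows $f_{t,\alpha}$ that are indexed by $\alpha$. Similarly, for a column $\varphi(c)$ of $\Phi$, let $\varphi(c)_\alpha$ denote the restriction of that column to the rows indexed by $\alpha$. Conditioning on $T$ and fixing $\alpha \in \mathbb{F}_q^*$, let

$$X(x, \alpha) := \sum_{t \in T} g_t\,\langle f_{t,\alpha}, x\rangle = \langle g, \Phi_\alpha x\rangle.$$

Let $B_1^N$ denote the $\ell_1$ ball in $\mathbb{R}^N$. Since $S_L \subset L B_1^N$, we have

$$\Phi_\alpha(S_L) \subset L\,\Phi_\alpha(B_1^N) = \mathrm{conv}\{\pm L\,\varphi(c)_\alpha : c \in \mathcal{C}\}.$$

Thus, we have

$$\begin{aligned}
\mathbb{E}_g\max_{x \in S_L}\max_{\alpha \in \mathbb{F}_q^*}|X(x, \alpha)|^p
&= \mathbb{E}_g\max_{\alpha \in \mathbb{F}_q^*}\max_{y \in \Phi_\alpha(S_L)}|\langle g, y\rangle|^p \\
&\le L^p\,\mathbb{E}_g\max_{\pm c \in \mathcal{C}}\max_{\alpha \in \mathbb{F}_q^*}|\langle g, \varphi(c)_\alpha\rangle|^p,
\end{aligned} \tag{7}$$

using the fact that $\max_{x \in \mathrm{conv}(S)} F(x) = \max_{x \in S} F(x)$ for any convex function $F$. Using Lemma 9, and the fact that $\langle g, \varphi(c)_\alpha\rangle$ is Gaussian with variance $\|\varphi(c)_\alpha\|_2^2 = n$,

$$L^p\,\mathbb{E}_g\max_{\pm c \in \mathcal{C}}\max_{\alpha \in \mathbb{F}_q^*}|\langle g, \varphi(c)_\alpha\rangle|^p \le \Bigl(C_1 L\sqrt{np}\,\bigl(2N(q-1)\bigr)^{1/p}\Bigr)^p. \tag{8}$$

Together, (6), (7), and (8) imply

$$\begin{aligned}
\mathbb{E}\max_{x \in S_L}\bigl|\|\Phi x\|_1 - \mathbb{E}\|\Phi x\|_1\bigr|^p
&\le C_2\,(q-1)^p\,\mathbb{E}_T\Bigl(C_1 L\sqrt{np}\,\bigl(2N(q-1)\bigr)^{1/p}\Bigr)^p \\
&\le \Bigl(C_2^{1/p}\,C_1\,(q-1)^{1+1/p}\,L\sqrt{np}\,(2N)^{1/p}\Bigr)^p \\
&=: Q(p)^p.
\end{aligned}$$

Finally, we set $p = \ln(N)$, so we have

$$Q(\ln(N)) \le C_3\,(q-1)\,L\sqrt{n\ln(N)},$$

for another constant $C_3$. Then Markov's inequality implies

$$\mathbb{P}\Bigl\{\max_{x \in S_L}\bigl|\|\Phi x\|_1 - \mathbb{E}\|\Phi x\|_1\bigr| > e\,Q(\ln(N))\Bigr\} \le \frac{1}{N}.$$

We conclude that with probability at least $1 - o(1)$,

$$\frac{1}{L}\max_{x \in S_L}\bigl|\|\Phi x\|_1 - \mathbb{E}\|\Phi x\|_1\bigr| \le C_0\,(q-1)\sqrt{n\ln(N)},$$

for $C_0 = e\,C_3$. □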
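
The choice $p = \ln(N)$ in this proof works because $N^{1/\ln N} = e$, so the factor $(2N(q-1))^{1/p}$ stays bounded, while Markov's inequality applied to the $p$-th moment gives a failure probability of at most $e^{-p} = 1/N$. The short Python sketch below illustrates both points numerically; the function names are illustrative, and the constants $C_1, C_2$ are left out since their values are not specified.

```python
import math

def p_dependent_factor(N, q, p):
    """The factor (q-1)^(1/p) * (2N)^(1/p) * sqrt(p) appearing in Q(p)."""
    return (q - 1) ** (1 / p) * (2 * N) ** (1 / p) * math.sqrt(p)

def markov_failure_bound(p):
    """P[X > e*Q] = P[X^p > (e*Q)^p] <= E[X^p] / (e*Q)^p <= e^(-p)."""
    return math.exp(-p)

if __name__ == "__main__":
    q, k = 3, 40
    N = q ** k
    p = math.log(N)
    # With p = ln N the factor is a bounded multiple of sqrt(ln N).
    print(p_dependent_factor(N, q, p) / math.sqrt(p))
    print(markov_failure_bound(p) * N)   # close to 1, i.e. the bound is 1/N
```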

Now we may prove Theorem 2.

Proof of Theorem 2. Lemmas 7 and 8, along with (5), imply that

$$\frac{1}{L}\max_{x \in S_L}\|\Phi x\|_1 \le \frac{n(q-1)}{\sqrt{L}} + C_0\,(q-1)\sqrt{n\ln(N)}$$

with probability $1 - o(1)$. Thus, if

$$(q-1)\Bigl(\frac{n}{\sqrt{L}} + C_0\sqrt{n\ln(N)}\Bigr) < (q-1)n\varepsilon \tag{9}$$

holds, the condition (4) also holds with probability $1 - o(1)$. Setting $L = (2/\varepsilon)^2$ and $n = \frac{4C_0^2\ln(N)}{\varepsilon^2}$ satisfies (9), so Lemma 6 implies that $\mathcal{C}$ is $((1 - 1/q)(1 - \varepsilon), 4/\varepsilon^2)$-list decodable, and the rate is

$$\frac{\log_q(N)}{n} = \frac{\varepsilon^2}{(2C_0)^2\ln(q)}.$$

□
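
The parameter choices in this proof can be traced numerically. The Python sketch below computes $L$, $n$, the two terms on the left of (9) (divided by $q-1$), and the resulting rate. The value of $C_0$ is not specified in the text, so the default used here is purely a placeholder, and the function name is illustrative. With exactly these choices each of the two terms equals $n\varepsilon/2$, so their sum meets the target $n\varepsilon$ with equality; taking $n$ slightly larger makes the inequality strict.

```python
import math

def theorem2_parameters(eps, q, k, C0=1.0):
    """Parameter choices from the proof of Theorem 2.

    C0 stands for the unspecified absolute constant of Lemma 8; the default
    value 1.0 is only a placeholder for illustration.
    """
    ln_N = k * math.log(q)                       # N = q^k
    L = (2 / eps) ** 2
    n = 4 * C0**2 * ln_N / eps**2
    mean_term = n / math.sqrt(L)                 # Lemma 7 term / (q - 1)
    deviation_term = C0 * math.sqrt(n * ln_N)    # Lemma 8 term / (q - 1)
    return {
        "L": L,
        "n": n,
        "mean_term": mean_term,
        "deviation_term": deviation_term,
        "target": n * eps,                       # right side of (9) / (q - 1)
        "rate": k / n,                           # log_q(N) / n
    }

if __name__ == "__main__":
    params = theorem2_parameters(eps=0.1, q=3, k=20, C0=2.0)
    for name, value in params.items():
        print(name, value)
```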



5 Generalizations 



In this section, we show that our approach above applies not just to random linear codes, but to many ensembles. In our proof of Theorem 2, we required only that the expectation of $\|\Phi x\|_1$ be about right, and that the columns of the generator matrix were chosen independently, so that Lemma 8 implies concentration. The fact that $\mathbb{E}\|\Phi x\|_1$ was about right followed from the condition (2), which required that, within sets $\Lambda \subset \mathcal{C}$ of size $L$, the average pairwise distance is, in expectation, large. We formalize this observation in the following lemma, which can be substituted for Lemma 7.
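Lemma 10. Let $\mathcal{C} = \{c_1, \ldots, c_N\} \subseteq [q]^n$ be a (not necessarily uniformly) random code so that for any $\Lambda \subset [N]$ with $|\Lambda| = L$,

$$\frac{1}{\binom{L}{2}}\,\mathbb{E}\sum_{i < j \in \Lambda} d(c_i, c_j) \ge \Bigl(1 - \frac{1}{q}\Bigr)(1 - \eta). \tag{10}$$

Then for all $x \in S_L$,

$$\frac{1}{L}\,\mathbb{E}\|\Phi x\|_1 \le n(q-1)\sqrt{\frac{1}{L} + \eta}.$$

Proof. Fix $x \in S_L$, and let $\Lambda$ denote the support of $x$. Then, using (1),

$$\begin{aligned}
\frac{1}{L}\,\mathbb{E}\|\Phi x\|_1
&\le \frac{\sqrt{n(q-1)}}{L}\bigl(\mathbb{E}\|\Phi x\|_2^2\bigr)^{1/2} \\
&= \frac{\sqrt{n(q-1)}}{L}\Bigl(\mathbb{E}\sum_{i,j \in \Lambda}\langle \varphi(c_i), \varphi(c_j)\rangle\Bigr)^{1/2} \\
&= \frac{\sqrt{n(q-1)}}{L}\Bigl(\mathbb{E}\sum_{i,j \in \Lambda}\bigl((q-1)n - qn\,d(c_i, c_j)\bigr)\Bigr)^{1/2} \\
&\le \frac{\sqrt{n(q-1)}}{L}\Bigl(L(q-1)n + 2\binom{L}{2}n(q-1)\eta\Bigr)^{1/2} \\
&\le n(q-1)\sqrt{\frac{1}{L} + \eta},
\end{aligned}$$

as claimed. □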

Thus, we may prove a statement analogous to Theorem 2 about any distribution on linear codes whose generator matrix has independent columns, which satisfies (10). Where might we find such distributions? Notice that if the expectation is removed, (10) is precisely the hypothesis of the average case Johnson bound (Theorem 8 in [CGV13]), and so any code $\mathcal{C}$ to which the average case Johnson bound applies attains (10). However, such a code $\mathcal{C}$ might have substantially suboptimal rate; we can improve the rate, and still satisfy (10), by forming a generator matrix for a new code $\mathcal{E}$ from a random set of columns of the generator matrix of $\mathcal{C}$.

Definition 11. Fix a code $\mathcal{C} \subseteq [q]^{n'}$, and define an ensemble $\mathcal{E} = \mathcal{E}(\mathcal{C}) \subseteq [q]^n$ as follows. To draw $\mathcal{E}$, choose a random multiset $T = \{t_1, \ldots, t_n\}$ of size $n$ by drawing elements of $[n']$ independently with replacement. Then let

$$\mathcal{E} = \{(x_{t_1}, \ldots, x_{t_n}) : x \in \mathcal{C}\}.$$

Remark 2. We may think of the operation in Definition 11 as randomly puncturing $\mathcal{C}$. This is not quite correct, because the indices $t_i$ are sampled with replacement, but it is correct in spirit. In particular, all of the results that follow would hold if we retained each coordinate in $[n']$ independently with probability $n/n'$, and this would indeed be a punctured code, with expected length $n$. Ignoring these technicalities, we will refer below to the codes of Definition 11 as "randomly punctured codes."
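
A draw from the ensemble of Definition 11 is straightforward to simulate. The following Python sketch samples the multiset $T$ with replacement and restricts each codeword to the sampled coordinates. The function name and the representation of a code as a list of tuples are assumptions of this sketch.

```python
import random

def randomly_puncture(code, n, rng):
    """Draw one code from the ensemble of Definition 11.

    `code` is a list of codewords of common length n'. A multiset T of n
    coordinates is drawn independently with replacement, and every codeword
    is restricted to those coordinates (repeated coordinates are kept).
    """
    n_prime = len(code[0])
    T = [rng.randrange(n_prime) for _ in range(n)]
    return T, [tuple(c[t] for t in T) for c in code]

if __name__ == "__main__":
    rng = random.Random(0)
    # A small binary linear code of length n' = 6 (illustrative example).
    code = [(0, 0, 0, 0, 0, 0), (1, 1, 1, 0, 0, 0),
            (0, 0, 0, 1, 1, 1), (1, 1, 1, 1, 1, 1)]
    T, punctured = randomly_puncture(code, 3, rng)
    print(T, punctured)
```

Since each coordinate of the new code is a coordinate of the original one, the punctured code of a linear code is again linear, and its generator matrix consists of the sampled columns of the original generator matrix.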

Replacing Lemma 7 with Lemma 10 in the proof of Theorem 2 immediately implies that randomly punctured codes are list decodable with high probability, if the original code $\mathcal{C}$ has good average distance.

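Corollary 12. Let $\mathcal{C} = \{c_1, \ldots, c_N\} \subseteq \mathbb{F}_q^{n'}$ be any linear code with

$$\frac{1}{\binom{L}{2}}\sum_{i < j \in \Lambda} d(c_i, c_j) \ge \Bigl(1 - \frac{1}{q}\Bigr)(1 - \eta)$$

for all sets $\Lambda \subset [N]$ of size $L$. Set

[the defining expression for $\varepsilon$ in terms of $L$ and $\eta$ is missing from the source text]

There is some $R = \Omega(\varepsilon^2)$ so that if $\mathcal{E} = \mathcal{E}(\mathcal{C})$ is as in Definition 11 with rate $R$, then $\mathcal{E}$ is $((1 - 1/q)(1 - \varepsilon), L-1)$-list decodable with probability $1 - o(1)$.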

Theorem 8 in [CGV13] implies that if $\mathcal{C}$ is as in the statement of Corollary 12, then $\mathcal{C}$ itself is $((1 - 1/q)(1 - \varepsilon), O(1/\varepsilon^2))$-list decodable, for $\varepsilon$ as above. Thus, Corollary 12 implies that $\mathcal{E}(\mathcal{C})$ has the same list decodability properties as $\mathcal{C}$, but perhaps a much better rate.

As an example of this construction, consider the family of (binary) degree $r$ Reed-Muller codes, $\mathrm{RM}(r, m) \subseteq \mathbb{F}_2^{2^m}$. $\mathrm{RM}(r, m)$ can be viewed as the set of $m$-variate polynomials over $\mathbb{F}_2$ of degree at most $r$. It is easily checked that $\mathrm{RM}(r, m)$ is a linear code of dimension $k = 1 + \binom{m}{1} + \binom{m}{2} + \cdots + \binom{m}{r}$ and minimum relative distance $2^{-r}$. The resulting ensemble $\mathcal{E} = \mathcal{E}(\mathrm{RM}(r, m))$ is a natural class of codes: decoding $\mathcal{E}$ is equivalent to learning a degree $r$ polynomial over $\mathbb{F}_2^m$ from random samples, in the presence of (worst case) noise.
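
The dimension and minimum distance quoted above can be verified by brute force for small $r$ and $m$. The Python sketch below enumerates the evaluation vectors of all polynomials of degree at most $r$ over $\mathbb{F}_2^m$; the function names are illustrative, and the enumeration is exponential in the dimension, so it is only meant for toy parameters.

```python
import itertools
import math

def rm_dimension(r, m):
    """Dimension 1 + C(m,1) + ... + C(m,r) of RM(r, m)."""
    return sum(math.comb(m, i) for i in range(r + 1))

def rm_codewords(r, m):
    """Evaluations on F_2^m of all polynomials of degree at most r."""
    points = list(itertools.product((0, 1), repeat=m))
    monomials = [S for d in range(r + 1)
                 for S in itertools.combinations(range(m), d)]
    # Evaluation vector of each monomial prod_{i in S} x_i (empty S gives 1).
    basis = [[int(all(pt[i] for i in S)) for pt in points] for S in monomials]
    words = set()
    for coeffs in itertools.product((0, 1), repeat=len(basis)):
        word = [0] * len(points)
        for c, b in zip(coeffs, basis):
            if c:
                word = [(w + v) % 2 for w, v in zip(word, b)]
        words.add(tuple(word))
    return words

def min_relative_distance(words):
    """Minimum relative weight of a nonzero codeword of a linear code."""
    n = len(next(iter(words)))
    return min(sum(w) for w in words if any(w)) / n

if __name__ == "__main__":
    for r, m in ((1, 3), (2, 3), (2, 4)):
        W = rm_codewords(r, m)
        print(r, m, len(W) == 2 ** rm_dimension(r, m), min_relative_distance(W))
```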

We cannot hope for short list sizes in this case, but we can hope for nontrivial ones. Kaufman, Lovett, and Porat [KLP12] have given tight asymptotic bounds on the list sizes for $\mathrm{RM}(r, m)$ for all radii, and in particular have shown that $\mathrm{RM}(r, m)$ is list decodable up to $1/2 - \varepsilon$ with list sizes on the order of $\varepsilon^{-O_r(m^{r-1})}$. As $|\mathrm{RM}(r, m)|$ is exponential in $m^r$, this is a nontrivial bound. We will show that randomly punctured Reed-Muller codes, with rate $\Omega(\varepsilon^2)$, have basically the same list decoding parameters as their un-punctured progenitors.

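Proposition 13. Let $\mathcal{E} = \mathcal{E}(\mathrm{RM}(r, m))$ be as in Definition 11, with rate $O(\varepsilon^2)$. Then $\mathcal{E}$ is $(\tfrac{1}{2}(1 - \varepsilon), L(\varepsilon))$-list decodable with probability $1 - o(1)$, where

$$L(\varepsilon) = \Bigl(\frac{1}{\varepsilon}\Bigr)^{O_r(m^{r-1})},$$

and $O_r$ hides constants depending only on $r$.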

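Proof. We aim to find $\eta$ so that (10) is satisfied. As usual, let $N = |\mathrm{RM}(r, m)|$. We borrow a computation from the proof of Lemma 6 in [CGV13]. Let $A = A(\varepsilon)$ be the number of codewords of $\mathrm{RM}(r, m)$ with relative weight at most $\tfrac{1}{2}(1 - \varepsilon^2)$. Let $L = A/\varepsilon^2$ and choose a set $\Lambda \subset [N]$ of size $L$. By linearity, for each codeword $c_i$ with $i \in \Lambda$, there are at most $A - 1$ codewords $c_j$ within $\tfrac{1}{2}(1 - \varepsilon^2)$ of $c_i$, out of $L - 1$ choices for $c_j$. Thus, the sum of the relative distances over $j \neq i$ is at least $(L - A) \cdot \tfrac{1}{2}(1 - \varepsilon^2)$. This implies

$$\begin{aligned}
\frac{1}{\binom{L}{2}}\sum_{i < j \in \Lambda} d(c_i, c_j)
&\ge \frac{1}{2}\,(1 - \varepsilon^2)\,\frac{L - A}{L - 1} \\
&= \frac{1}{2}\bigl(1 - O(\varepsilon^2)\bigr),
\end{aligned}$$

using the choice $L = A/\varepsilon^2$ in the final line. Thus, in Corollary 12, we may take $\eta = O(\varepsilon^2)$. We conclude that the randomly punctured code $\mathcal{E}(\mathrm{RM}(r, m))$ of rate $O(\varepsilon^2)$ is $(\tfrac{1}{2}(1 - \varepsilon), L-1)$-list decodable, with list size $L$ on the order of $A/\varepsilon^2$. It remains to estimate $A = A(\varepsilon)$. It is shown in [KLP12] that

$$A = A(\varepsilon) = \Bigl(\frac{1}{\varepsilon}\Bigr)^{O_r(m^{r-1})},$$

which finishes the proof. □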
Another popular ensemble of linear codes is the Wozencraft ensemble [Jus72, Wel73], which encodes an element $x \in \mathbb{F}_{q^k}$ as $(x, \alpha_1 x, \alpha_2 x, \ldots, \alpha_r x)$ for uniformly random $\alpha_i \in \mathbb{F}_{q^k}$. In this case, the symbols within a codeword are not all independent, so Lemma 10 does not apply. However, the techniques above extend immediately to imply that a code from this ensemble (with $r \sim k/\varepsilon^2$) is $((1 - 1/q)(1 - \varepsilon), O(1/\varepsilon^2))$-list decodable with rate $\varepsilon^2/k$. (Previously, the only known result about the list decodability of the Wozencraft ensemble follows from the Johnson bound, which implies a rate on the order of $\varepsilon^4$ for the same radius, so for very small $\varepsilon$ this is better.) It would be interesting to see if this argument could be modified to obtain constant rate for the Wozencraft ensemble, or for other ensembles of linear codes.

6 Conclusion 

We have shown that a random linear code of rate $\Omega(\varepsilon^2/\log(q))$ is $((1 - 1/q)(1 - \varepsilon), O(1/\varepsilon^2))$-list decodable with probability $1 - o(1)$. Our result improves the results of [CGV13] in three ways. First, we remove the logarithmic dependence on $\varepsilon$ in the rate, achieving the optimal dependence on $\varepsilon$. Second, we improve the dependence on $q$, from $1/\log^4(q)$ to $1/\log(q)$. Finally, we show that list decodability holds with probability $1 - o(1)$, rather than with constant probability. Our result is the first to establish the existence of optimally list decodable $q$-ary linear codes for this parameter regime for general $q$. As an added benefit, our proof is relatively short and straightforward. To illustrate the applicability of our argument, we showed that in fact our techniques apply to many ensembles of random codes, including randomly punctured codes. As an example, we considered Reed-Muller codes, and showed that, when randomly punctured until the rate was constant, they retain their combinatorial list decoding properties with high probability.